Analysis of short term price trends in daily stock-market index data 
In financial time series there are periods in which the value increases or decreases monotonically. 
We call those periods elemental trends and study the probability distribution of their duration for 
the indices DJIA, NASDAQ and IPC. It is found that the trend duration distribution often differs 
from the one expected under no memory. The expected and observed distributions are compared 
by means of the Anderson-Darling test. 
I. INTRODUCTION 

One of the goals of financial-market analysis is to predict
the future movements of prices and financial indices.
To achieve this goal, a huge variety of methods to forecast
market behavior have been developed, ranging from complex
mathematical models to astrological pseudo-scientific
techniques. An approach that has recently been growing in
popularity is the statistical analysis of large data sets,
which has become possible thanks to the increasing
availability of computing power and high-quality data. This
approach has benefited from contributions not only from
economists, but also from many physicists and mathematicians
who have applied methods and ideas of probability theory and
statistical physics to finance. A set of nontrivial
statistical properties of historical data has been observed
and classified as "stylized facts" [3], which are expected to
provide better insight into market structure and behavior.

When observing the time series of the prices of an asset
on a chart, it is common to see "trends" in which most
of the values are greater (or smaller) than the previous
ones. These trends are very popular within so-called
technical analysis. Trends such as those studied by technical
analysis can be seen as composed of smaller elemental
trends, periods in which the value increases or decreases
monotonically. This kind of trend is the one that
will be studied in the present work. Among other things,
technical analysts seek patterns in the charts of financial
data that are believed to indicate changes in the
trend direction. The effectiveness of technical analysis is
disputed and called into question by what is known as the
Efficient Market Hypothesis (EMH). Before going further,
it is necessary to give some definitions. In Subsections
I A, I B and I C of this introduction these definitions and
other useful information are presented. In Section II
a model for the distribution of trend durations is
developed from the EMH. Section III explains how
the data were analyzed and Section IV provides an
interpretation of the analysis.



A. Definitions 

Let $S(t)$ be the price of an asset or an index value at
time $t$ and $X(t) = \log S(t)$ its logarithm. The log-return
at time $t$ is defined as:

$$r(t, \Delta t) = X(t + \Delta t) - X(t) \qquad (1)$$

for a given time sampling scale $\Delta t$. If the price variation
is small, the log-return is a good approximation of the
return

$$R(t, \Delta t) = \frac{S(t + \Delta t) - S(t)}{S(t)}. \qquad (2)$$

In this paper, we consider $\Delta t$ equal to 1 trading day and
we use the daily close values of the indices to build the
series $S(t)$. More details on the data sets will be given in
Section III A.
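
As a minimal sketch of definitions (1) and (2), the following Python fragment computes both kinds of return for a time step of one sample. The price list `closes` is made up for illustration and the function names are not taken from the paper.

```python
import math

def log_returns(prices):
    """Log-returns r(t) = log S(t+1) - log S(t) for a time step of one sample."""
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]

def simple_returns(prices):
    """Returns R(t) = (S(t+1) - S(t)) / S(t) for a time step of one sample."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

# Hypothetical daily closing values, for illustration only.
closes = [100.0, 100.5, 99.8, 100.2, 101.0]
r = log_returns(closes)
R = simple_returns(closes)

# For small price variations the two definitions nearly coincide,
# since log(1 + R) = R - R**2 / 2 + ...
max_gap = max(abs(x - y) for x, y in zip(r, R))
```

For daily changes below one percent, as in this made-up series, the gap between the two definitions is of the order of the squared return.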

An elemental trend of duration $k$ is defined here as
a subseries of $k + 1$ values within the series $S(t)$ in which
every value is greater than (for an uptrend) or smaller than
or equal to (for a downtrend) the preceding one (Figure 1).
The aim of this work is to study this kind of short-term
trend with a statistical approach.
FIG. 1: The line segments join the starting and ending points 
of each elemental trend. 
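
The definition of an elemental trend can be sketched in Python as follows. This is only an illustration under the stated convention (strict increases form uptrends, decreases or unchanged values form downtrends). The function name and the sample series are hypothetical.

```python
from itertools import groupby

def elemental_trends(prices):
    """Split a price series into maximal elemental trends.

    Returns a list of (direction, duration) pairs, where direction is
    +1 for an uptrend (strict increases) and -1 for a downtrend
    (decreases or unchanged values), and duration is the number of
    consecutive price changes with that direction.
    """
    signs = [1 if b > a else -1 for a, b in zip(prices, prices[1:])]
    return [(s, len(list(run))) for s, run in groupby(signs)]

# Hypothetical series, for illustration only.
series = [10, 11, 12, 12, 11, 13, 14, 15, 14]
trends = elemental_trends(series)
# trends == [(1, 2), (-1, 2), (1, 3), (-1, 1)]
```

A trend of duration k spans k + 1 consecutive values, so the first uptrend above covers the values 10, 11, 12.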



B. The Efficient Market Hypothesis 

The EMH claims that the market quickly finds the rational
price for a traded asset [13]. The most important
consequence of this hypothesis was shown by P. Samuelson
[15]: the best forecast for the future price of an asset
is its present price,

$$E(S(t + \Delta t) \mid \mathcal{F}_t) = S(t), \qquad (3)$$

where $E(\cdot \mid \mathcal{F}_t)$ is the conditional expectation with respect
to the filtration $\mathcal{F}_t$, namely with respect to the known
history up to time $t$. Indeed, it is easy to derive the
EMH from a simple no-arbitrage argument. Suppose we
have two assets, a risky one with price $S(t)$ and a risk-free
one giving a constant interest rate $r_F$. To avoid
arbitrage, one has to require that the expected return of
the risky asset is equal to the risk-free interest rate, that is,

$$E(R(t, \Delta t) \mid \mathcal{F}_t) = r_F; \qquad (4)$$

the latter equation immediately yields, for non-vanishing
$S(t)$,

$$E(S(t + \Delta t) \mid \mathcal{F}_t) = (1 + r_F) S(t), \qquad (5)$$

which reduces to (3) for $r_F = 0$. Equations (3) and
(5), jointly with the integrability of the process $S(t)$, are
known as the martingale and sub-martingale (remember that
$r_F > 0$) conditions, respectively.
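
The step from Equation (4) to Equation (5) can be spelled out as follows. The only ingredients are the definition of the return in Equation (2) and the fact that $S(t)$ is known at time $t$, so that it can be taken out of the conditional expectation:

```latex
\begin{align*}
E\big(R(t,\Delta t)\,\big|\,\mathcal{F}_t\big)
  &= E\!\left(\frac{S(t+\Delta t)-S(t)}{S(t)}\,\Big|\,\mathcal{F}_t\right)
   = \frac{E\big(S(t+\Delta t)\,\big|\,\mathcal{F}_t\big)}{S(t)} - 1 = r_F \\
\Longrightarrow\quad
E\big(S(t+\Delta t)\,\big|\,\mathcal{F}_t\big) &= (1+r_F)\,S(t).
\end{align*}
```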

The EMH would invalidate the attempts of technical 
analysis to predict future prices or trends; in fact, in 
Samuelson's words, "there is no way of making an ex- 
pected profit by extrapolating past changes in the futures 
price, by chart or any esoteric devices of magic or mathe- 
matics" [15] as the best forecast of the future price would 
be the current price. 



C. Stylized facts 

As mentioned before, financial time series share some
nontrivial statistical properties called stylized facts.
Although those properties are often formulated qualitatively,
they are so constraining that it is difficult to reproduce
all of them by means of a stochastic process [3]. As a
matter of fact, none of the market models, including ana- 
lytical models, Monte Carlo simulations and multi-agent 
based models, created before 1990, when awareness of 
such regularities gradually started to appear, could re- 
produce all of these stylized facts [H] . As an interesting 
issue, some studies suggest that stylized facts appear not 
only in financial time series, but also in other complex 
systems such as Conway's Game of Life [5]. To fix the 
ideas, some of the stylized facts, taken from reference [3],
are listed below: 

Absence of linear autocorrelations: 

Autocorrelations of returns are often negligi- 
ble, except for very small time scales, depending 
on the market and on the time horizon. 

Heavy tails: The return distribution is leptokurtic and 
some authors claim that the tails decay as a power- 
law. 

Gain-loss asymmetry: Large downward jumps in 
stock prices and stock index values are observed, 
but not equally large upward movements. (In ex- 
change rates there is a higher symmetry in up/down 
movements). 

Volatility clustering: High volatility events do cluster 
in time. 



II. AN 'EFFICIENT MARKET' MODEL FOR 
THE DURATION DISTRIBUTION 

Among all the possible martingale or sub-martingale 
models that can describe price fluctuations, the geomet- 
ric random walk is the simplest one. A geometric random 
walk is just a product of independent and identically dis- 
tributed positive random variables. If the expected value 
of these variables is 1, then the geometric random walk 
is a martingale; otherwise, if the expected value is larger 
than 1, the geometric random walk is a submartingale. 
However, the geometric random walk hypothesis is nei- 
ther necessary nor sufficient for an efficient market, as 
shown by many authors among whom Leroy jS], Lucas 
[7] and Lo and Mckinlay [SJ. To understand this point, 
it is enough to consider Equation (3), which allows for any
martingale model.

At each step of a series of index values, there are two 
possible outcomes: the index either increases or does not 
increase. In an efficient market, the expected future price 
depends only on information about the current price, not 
on its previous history. Therefore, it should be impossible 
to predict the expected direction of a future price change 
given the history of the price process. In formulas, from
Equation (5) (after discounting for the risk-free rate), we
have

$$E(S(t + \Delta t) - S(t) \mid \mathcal{F}_t) = 0; \qquad (6)$$

if we consider the sign of the price change
$Y(t, \Delta t) = \mathrm{sign}(S(t + \Delta t) - S(t))$,
which coincides with the sign of returns, we accordingly have

$$E(Y(t, \Delta t)) = 0. \qquad (7)$$


If the price follows a geometric random walk, then the
series of price-change signs can be modeled as a Bernoulli
process. This process could be biased to take the presence
of a risk-free interest rate into account. To be more
specific, let us consider a log-normal geometric random
walk and let us use the assumption $\Delta t = 1$. Let $S_0$ be
the initial price. The price at time $t$ will be given by

$$S(t) = S_0 \prod_{i=1}^{t} Q_i, \qquad (8)$$

where the $Q_i$ are independent and identically distributed
random variables following a log-normal distribution with
parameters $\mu$ and $\sigma$. These two parameters come from
the corresponding normal distribution for log-returns. As
a direct consequence of the EMH in the form (5), we have

$$E(Q) = 1 + r_F, \qquad (9)$$

and for a log-normally distributed random variable, we
also have

$$E(Q) = e^{\mu} e^{\sigma^2/2}. \qquad (10)$$

This leads to a dependence between the two parameters:

$$\mu = \log(1 + r_F) - \frac{\sigma^2}{2}. \qquad (11)$$

Note that, when $r_F = 0$, it is impossible to get $\mu = 0$.
This reflects a more general result: if the price process is a
martingale, the log-price process cannot be a martingale,
and vice versa. Starting from the cumulative distribution
function for a log-normal random variable,

$$F_Q(u) = P(Q < u) = \frac{1}{2} + \frac{1}{2}\,\mathrm{erf}\left(\frac{\log(u) - \mu}{\sqrt{2\sigma^2}}\right), \qquad (12)$$

the probability of a negative sign is given by

$$q = F_Q(1) = P(Q < 1) = \frac{1}{2} + \frac{1}{2}\,\mathrm{erf}\left(\frac{-\mu}{\sqrt{2\sigma^2}}\right) = \frac{1}{2} + \frac{1}{2}\,\mathrm{erf}\left(\frac{\sigma}{2\sqrt{2}} - \frac{\log(1 + r_F)}{\sigma\sqrt{2}}\right), \qquad (13)$$

which yields $q = 1/2$ for $r_F = e^{\sigma^2/2} - 1$.
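
The relation between the parameters and the probability q can be checked numerically. The sketch below evaluates Equations (11) and (13) with the standard-library error function. The volatility value and the function names are assumptions chosen for illustration, not values from the paper.

```python
import math

def lognormal_mu(sigma, r_f):
    """mu implied by E(Q) = 1 + r_F for a log-normal Q (Equation 11)."""
    return math.log(1.0 + r_f) - sigma ** 2 / 2.0

def prob_not_increase(sigma, r_f):
    """q = P(Q < 1) for a log-normal Q with the mu above (Equation 13)."""
    mu = lognormal_mu(sigma, r_f)
    return 0.5 + 0.5 * math.erf(-mu / (sigma * math.sqrt(2.0)))

# Hypothetical daily volatility, for illustration only.
sigma = 0.01
q_zero_rate = prob_not_increase(sigma, 0.0)     # slightly above 1/2
r_balanced = math.exp(sigma ** 2 / 2.0) - 1.0   # rate for which mu = 0
q_balanced = prob_not_increase(sigma, r_balanced)
```

With a zero risk-free rate the negative drift in the log-price makes a non-increase slightly more likely than an increase, while the balancing rate restores q = 1/2 up to rounding error.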

It becomes natural to use the biased Bernoulli process
as the null hypothesis for the time series of signs [17].
It is well known that, for a Bernoulli process with success
probability $p = 1 - q$, the number $N$ of failures before the
first success follows the geometric distribution $G(p)$:

$$P(k) = P(N = k) = p (1 - p)^k = p q^k. \qquad (14)$$



The duration of an elemental downward trend is the num- 
ber of days before the price increases, so the distribution 
of such trend durations should follow a geometric distri- 
bution. An identical argument applies to the duration of 
an upward trend. Such sequences of identical outcomes 
are also known as runs or clumps in the mathematical 
literature. 
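
The argument above can be illustrated with a small simulation: if each day is an independent "up" with probability p, the observed durations of downward trends should match the geometric probabilities of Equation (14). This is a sketch with an arbitrary seed and a hypothetical value of p, not a reproduction of the analysis in the paper.

```python
import random
from collections import Counter

def geometric_pmf(k, p):
    """P(N = k) = p * (1 - p)**k, k = 0, 1, 2, ... (Equation 14)."""
    return p * (1.0 - p) ** k

def simulate_durations(p, n_trends, rng):
    """Durations of downward trends when each day is 'up' with probability p."""
    durations = []
    for _ in range(n_trends):
        k = 0
        while rng.random() >= p:   # a 'failure': the price does not increase
            k += 1
        durations.append(k)
    return durations

rng = random.Random(12345)         # fixed seed so the sketch is reproducible
p = 0.5                            # hypothetical success probability
sample = simulate_durations(p, 20000, rng)
counts = Counter(sample)
empirical = {k: counts[k] / len(sample) for k in range(5)}
expected = {k: geometric_pmf(k, p) for k in range(5)}
```

Comparing `empirical` with `expected` for the first few durations gives a quick visual check of the memoryless hypothesis on simulated data.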



III. METHODOLOGY 



A. Data 



Three indices were analyzed, namely Dow Jones In- 
dustrial Average (DJIA), NASDAQ Composite and the 
Mexican Indice de Precios y Cotizaciones (IPC) during 
the periods 1 October 1928 - 7 July 2011, 5 February 1971 
- 30 September 2011 and 30 October 1978 - 7 April 2011, 
respectively. The data were taken from Yahoo Finance. 



B. Price and time scales 

The prices were expressed in terms of "constant 
money" using the consumer price indices from references 
[TU] and [TO]. Since the values of those indices were de- 
livered monthly, linear interpolation was used in order 
to express the daily values. The time was measured in 
trading days as discussed above. 



C. Building the sample 

For each series of index values, several time windows of 
1000 trading days were created, each one shifted forward 
by one trading day with respect to the previous one. This 
procedure resulted in 20785 time windows for the DJIA, 
10282 for the NASDAQ and 4997 for the IPC. Two histograms
were built for each time window, one for upward
trends and the other one for downward trends. For upward
trends, 500 points within each time window were selected
at random, and the number of days before the first
decrease was measured. Such waiting times were
the entries for each histogram. For instance, given the 

sequence h + + + H — , assume that the fourth entry 

is randomly chosen. Then the recorded waiting time is 
4. The histograms for downward trends were built in the 
same way. Finally, all the histograms were normalized. 
Examples of both uptrend and downtrend histograms for 
different time windows are shown in Figure 3.
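
The sampling procedure described in this subsection can be sketched as follows. The function names, the synthetic sign series and the treatment of runs that reach the end of a window (they are simply truncated there) are assumptions made for this illustration. The short `example` sequence is made up and is not the one printed in the text.

```python
import random
from collections import Counter

def waiting_time(signs, start, direction):
    """Number of consecutive entries equal to `direction`, counted from `start`."""
    k = 0
    while start + k < len(signs) and signs[start + k] == direction:
        k += 1
    return k

def window_histogram(signs, direction, n_points, rng):
    """Normalized histogram of waiting times from randomly chosen start points."""
    starts = [rng.randrange(len(signs)) for _ in range(n_points)]
    counts = Counter(waiting_time(signs, s, direction) for s in starts)
    return {k: c / n_points for k, c in sorted(counts.items())}

def sliding_windows(signs, width):
    """All windows of `width` entries, each shifted forward by one entry."""
    return [signs[i:i + width] for i in range(len(signs) - width + 1)]

example = [-1, 1, 1, 1, 1, 1, 1, -1]    # hypothetical sign sequence
wt = waiting_time(example, 3, +1)        # fourth entry (index 3) gives 4

rng = random.Random(2011)                # arbitrary seed
signs = [1 if rng.random() < 0.5 else -1 for _ in range(1200)]
windows = sliding_windows(signs, 1000)
up_hist = window_histogram(windows[0], +1, 500, rng)
down_hist = window_histogram(windows[0], -1, 500, rng)
```

A series of n price changes yields n - 1000 + 1 windows of 1000 days, so the 1200 synthetic signs above produce 201 windows.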



D. The Anderson-Darling goodness of fit test 

In order to compare the observed and expected distributions
of trend durations, the Anderson-Darling test
described in references [1, 2] was used. The Anderson-Darling
test was found to be the most suitable for this
purpose because it places more weight on the tails of a
distribution than other goodness-of-fit tests. The critical
values of the Anderson-Darling statistic $A_n^2$ depend
on the parameter of the geometric distribution, and
they were estimated using Monte Carlo simulations.
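
The text does not spell out which form of the statistic was used for the discrete geometric case. The sketch below therefore assumes one possible discrete analogue, in which the integral of the continuous statistic is replaced by a sum over the support weighted by the null probabilities, and estimates a critical value by Monte Carlo simulation. All names, the seed, the sample size and the number of simulations are illustrative assumptions.

```python
import random

def geometric_sample(p, n, rng):
    """n draws of the number of failures before the first success."""
    out = []
    for _ in range(n):
        k = 0
        while rng.random() >= p:
            k += 1
        out.append(k)
    return out

def discrete_ad_statistic(sample, p, tail=1e-12):
    """Assumed discrete analogue of A^2 for a geometric null with parameter p.

    A^2 = n * sum_k (F_emp(k) - F(k))**2 * P(k) / (F(k) * (1 - F(k))),
    summed over k = 0, 1, ... while the null tail probability exceeds `tail`.
    """
    n = len(sample)
    k_max = max(sample)
    counts = [0] * (k_max + 1)
    for x in sample:
        counts[x] += 1
    q = 1.0 - p
    a2, cum_obs, k = 0.0, 0, 0
    surv = q                          # 1 - F(k) = q**(k + 1)
    while surv > tail:
        if k <= k_max:
            cum_obs += counts[k]
        f_emp = cum_obs / n
        f_null = 1.0 - surv
        a2 += (f_emp - f_null) ** 2 * (p * q ** k) / (f_null * surv)
        surv *= q
        k += 1
    return n * a2

def mc_critical_value(p, n, level, n_sim, rng):
    """Monte Carlo estimate of the (1 - level) quantile of A^2 under the null."""
    stats = sorted(discrete_ad_statistic(geometric_sample(p, n, rng), p)
                   for _ in range(n_sim))
    return stats[min(n_sim - 1, int((1.0 - level) * n_sim))]

rng = random.Random(7)                # arbitrary seed
crit = mc_critical_value(p=0.5, n=500, level=0.05, n_sim=400, rng=rng)
null_stat = discrete_ad_statistic(geometric_sample(0.5, 500, rng), 0.5)
wrong_stat = discrete_ad_statistic(geometric_sample(0.3, 500, rng), 0.5)
```

Because the null distribution of the statistic depends on p, the simulation has to be repeated for each value of the parameter. In this sketch a sample drawn with p = 0.3 and tested against p = 0.5 gives a statistic far above the simulated critical value.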



[Figure 2 plot. Vertical axis: ratio of upward to total price changes (about 0.3 to 0.6); horizontal axis: time in years (1985 to 2010); curves: DJI, NASDAQ, IPC.]

FIG. 2: (Color online) Ratio of upward to total price changes 
in daily data, plotted against time for the years 1980 - 2011, 
calculated over a time window of 1000 trading days. 

IV. DISCUSSION 

In Figure 2 the ratio of upward to total price
changes in daily data is plotted against time for the years
1980 - 2011. This ratio is calculated over a time window
of 1000 trading days. It can be seen that the variations are
greater than those expected for the same time windows in
a Bernoulli process with parameter p = 1/2 (±0.05). It is
nevertheless interesting to find out whether the hypothesis
of a geometric distribution holds for shorter periods
(such as each of the 1000-day time windows individually),
because this would mean that in those periods the
direction of price changes was not predictable using
historical prices of the index. Figure 3 shows the distribution
of trend durations corresponding to different indices and
periods of 1000 days. Figure 4 displays the p-values of the
Anderson-Darling statistic for different periods. In order
to avoid confusion between the parameter p of the geometric
distribution and the p-values for the distribution of
$A_n^2$, the latter will be referred to as π-values. A
π-value is the probability of obtaining a value of
$A_n^2$ at least as large as the one actually observed,
given that the probability distribution is actually
geometric.
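
The rolling ratio of upward to total price changes, and the spread expected for a Bernoulli process, can be sketched as below. Reading the quoted ±0.05 as roughly three standard deviations of the ratio over 1000 days is an assumption of this sketch, and the price series is synthetic.

```python
import math
import random

def rolling_up_ratio(prices, width):
    """Fraction of upward price changes in each window of `width` changes."""
    ups = [1 if b > a else 0 for a, b in zip(prices, prices[1:])]
    total = sum(ups[:width])
    ratios = [total / width]
    for i in range(width, len(ups)):
        total += ups[i] - ups[i - width]
        ratios.append(total / width)
    return ratios

def bernoulli_band(p, width, n_sigma=3.0):
    """Half-width of an n_sigma band for the up-ratio of a Bernoulli(p) process."""
    return n_sigma * math.sqrt(p * (1.0 - p) / width)

band = bernoulli_band(0.5, 1000)      # about 0.047

# Synthetic geometric random walk, for illustration only.
rng = random.Random(1)
prices = [100.0]
for _ in range(3000):
    prices.append(prices[-1] * math.exp(rng.gauss(0.0, 0.01)))
ratios = rolling_up_ratio(prices, 1000)
```

Windows whose ratio falls well outside 0.5 ± `band` are the ones in which an unbiased Bernoulli description of the signs looks doubtful.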

It was observed that, as time passes, the direction of
price changes for the IPC and the NASDAQ is better
described by a geometric distribution (Figure 4). The
distribution of trend durations for the Dow Jones is
generally reasonably well fitted by the geometric distribution.
This fact can be interpreted as possible evidence that
the Mexican stock market (which has been public and
regulated since 1975) has been increasing its efficiency,
as reported by previous research [3]. The same claim
can be made about the NASDAQ, given that it is also a
market of relatively recent creation. In contrast, the Dow
Jones Industrial Average index represents a more mature
market. However, there is also evidence that the New
York Stock Exchange, represented by the Dow Jones,
swiftly increased its efficiency between the beginning of
the 1980s and the end of the 1990s [Hj. Figure 4 shows
that for the Dow Jones, the greatest deviations from the
geometric distribution in the studied period (almost the
whole 20th century) occurred between the years 1960 and
1980.



V. CONCLUSIONS 

The probability distributions for the duration of elemental
trends were studied for the market indices Dow
Jones Industrial Average (DJIA), NASDAQ Composite
and the Mexican Indice de Precios y Cotizaciones
(IPC). These distributions are expected to be geometric
and memoryless according to the discussion in Section II.
The IPC and the NASDAQ present periods in which the
memoryless hypothesis must be definitely rejected.
FIG. 4: (Color online) Top left: π-values of the Anderson-Darling statistic for the NASDAQ plotted against time. As time
passes, the data agree better with the geometric distribution; top right: π-values of the Anderson-Darling statistic for the IPC
plotted against time. As for NASDAQ, agreement between data and the geometric distribution increases with time; bottom:
π-values of the Anderson-Darling statistic for the DJIA plotted against time. The greatest deviations from the geometric
distribution occurred between the years 1960-1980.